SPU: Update CELL Communication Performance #17646

elad335 · 2025-10-31T06:57:16Z

Optimizations:

cellRtc functions, which use sys_memory_get_page_attribute internally, were being called non-stop by the PPU thread.
The writer lock in sys_memory_get_page_attribute was making SPUs wait needlessly.
A fast path that does not lock VM has been added to sys_rsx_context_iomap.
Exclude spu_thread::reservation_check address receptacle from writer_lock detection and waiting.
Fix spu_thread::reservation_check(hash) overload for main and stack memory.
Add fast path without checking allocation for spu_thread::reservation_check when the address is on the same page as GETLLAR's effective address.
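
The same-page fast path in the last item can be illustrated with a minimal sketch. It assumes a 4096-byte page and uses hypothetical names (`same_page`, `address_usable`, `slow_check`); it is not the code from this PR, only the general shape of skipping a lookup when two addresses share a page.

```cpp
#include <cstdint>

// Assumed page size for this sketch; the real granularity comes from the emulator's VM layer.
constexpr std::uint32_t page_size = 4096;

// XOR cancels the bits both addresses share, so the result stays below
// page_size only when every bit above the page offset is identical.
constexpr bool same_page(std::uint32_t a, std::uint32_t b)
{
	return (a ^ b) < page_size;
}

static_assert(same_page(0x20000, 0x20fff), "first and last byte of one page");
static_assert(!same_page(0x20fff, 0x21000), "neighbouring pages");

// slow_check is a hypothetical callback standing in for a full allocation lookup.
template <typename F>
bool address_usable(std::uint32_t addr, std::uint32_t getllar_eal, F&& slow_check)
{
	if (same_page(addr, getllar_eal))
	{
		return true; // fast path: the lookup is skipped entirely
	}

	return slow_check(addr);
}
```

The sketch treats the page that holds GETLLAR's effective address as already known to be accessible, which is an assumption about why the check can be skipped.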

Fixes #14724

elad335 · 2025-11-02T07:32:22Z

Added "SPURS oriented thread waiting", which will replace the "Preferred SPU Threads" setting and be active by default.

elad335 · 2025-11-02T09:18:53Z

Special settings tweaked from default:

Clocks Scale: 800%
Vblank Frequency: 480Hz
Frame Limit: Off
Preferred SPU Threads: Auto (as default)

PR:

Master

digant73 · 2025-11-02T10:12:20Z

Performance has regressed in some games, such as KZ3.

PR:

MASTER:

kd-11 · 2025-11-02T12:31:11Z

rpcs3/Emu/Cell/lv2/sys_rsx.cpp


+	constexpr u32 _1m = 1u << 20;
+
+	std::unique_lock fast_lock(render->sys_rsx_mtx, std::defer_lock);


I don't see the benefit of the double check under lock, especially since we expect that frame-to-frame the mappings won't actually change. Why not just lock and check once? I feel that would be faster here.
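
A minimal sketch of the "lock and check once" shape suggested above, under stated assumptions: `iomap_table`, `map_io`, the 256-entry table and the `unmapped` sentinel are all hypothetical and do not reflect the real sys_rsx data structures. The lock is taken a single time, each 1 MiB entry is compared, and a write happens only when the entry differs.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

constexpr std::uint32_t one_mib = 1u << 20;
constexpr std::uint32_t unmapped = 0xFFFFFFFFu;

// Hypothetical stand-in for an IO mapping table: one entry per 1 MiB block.
struct iomap_table
{
	std::mutex mtx;
	std::vector<std::uint32_t> io_to_ea = std::vector<std::uint32_t>(256, unmapped);
};

// Takes the lock once, compares every entry and writes only on a difference.
// Returns true if at least one entry had to change.
inline bool map_io(iomap_table& table, std::uint32_t io, std::uint32_t ea, std::uint32_t size)
{
	std::lock_guard<std::mutex> lock(table.mtx);
	bool changed = false;

	for (std::uint32_t off = 0; off < size; off += one_mib)
	{
		const std::size_t index = (io + off) / one_mib;

		if (index >= table.io_to_ea.size())
		{
			break; // outside this toy table
		}

		if (table.io_to_ea[index] != ea + off)
		{
			table.io_to_ea[index] = ea + off;
			changed = true;
		}
	}

	return changed;
}
```

When the mapping is unchanged from the previous frame, the function returns `false` after a read-only pass, so the common case costs one uncontended lock plus the comparisons.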

kd-11 · 2025-11-02T12:45:36Z

rpcs3/Emu/Cell/SPUThread.cpp

+					break;
+				}
+
+				const u64 current = get_system_time();


I've pointed it out before, but get_system_time is unreasonably heavy. Prefer TSC unless real-world precise values are required.

A general note - the spu_info logic (test_and_update_atomic_op_info) in general is quite heavy-handed with all the atomic ops and may eat into performance. The biggest issue I see is that there is no fast-path through this calling sequence (and the corresponding one below). Yes, SPURS itself is going to be almost always running task groups, but we also observe that in most games the parallel misses themselves aren't too bad with modern processors, though I agree we need something more sophisticated than the quick hack that was the preferred threads option.
This is all theory of course, we'll just have to see if it ends up worth the overhead with the big hitters like RDR, TLOU or Killzone titles.
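
To illustrate the TSC suggestion, here is a minimal sketch with hypothetical helper names (`cheap_ticks`, `budget_spent`). It reads the time-stamp counter through the `__rdtsc` intrinsic on x86 and falls back to `std::chrono::steady_clock` elsewhere. The returned ticks are not microseconds; converting them to real time would need a calibrated frequency, which is omitted here.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  if defined(_MSC_VER)
#    include <intrin.h>
#  else
#    include <x86intrin.h>
#  endif
#  define SKETCH_HAS_TSC 1
#else
#  define SKETCH_HAS_TSC 0
#endif

// Cheap relative timestamp: raw TSC ticks on x86, steady_clock ticks otherwise.
inline std::uint64_t cheap_ticks()
{
#if SKETCH_HAS_TSC
	return __rdtsc();
#else
	return static_cast<std::uint64_t>(std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// True once at least `budget` ticks separate `now` from `start`.
// Unsigned subtraction keeps the comparison valid across counter wraparound.
constexpr bool budget_spent(std::uint64_t now, std::uint64_t start, std::uint64_t budget)
{
	return now - start >= budget;
}
```

A wait loop would capture `cheap_ticks()` once on entry and test `budget_spent(cheap_ticks(), start, budget)` on each iteration, keeping wall-clock queries for places that need real-world time.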

kd-11 · 2025-11-02T12:48:32Z

rpcs3/Emu/Cell/SPUThread.cpp

+
+	spu_info[index].release(info);
+
+	for (usz i = 0; i < spu_info.size(); i++)


I think we can abuse vector ops for this sequence and gain implicit atomicity.
Have the spu info as an object of arrays instead of array of objects.
Then you can just load all of them at once and (ab)use vector ops on the vector to figure out how much overlap there is.

On x86 at least, vector ops are atomic as long as they are naturally aligned too so we basically get that for free.
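
A minimal sketch of the object-of-arrays idea, assuming four slots and a hypothetical `spu_info_soa` layout (the real spu_info contents are not shown in this thread). One aligned 16-byte load fetches every slot, and a single SSE2 compare reports how many slots hold a given address. The sketch covers only the comparison step; it does not by itself establish the atomicity property mentioned above.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define SKETCH_HAS_SSE2 1
#else
#  define SKETCH_HAS_SSE2 0
#endif

// Hypothetical layout: one 32-bit address per SPU slot, packed into one 16-byte vector.
struct spu_info_soa
{
	alignas(16) std::uint32_t addr[4];
};

// Counts how many slots currently hold `key`.
inline int count_matches(const spu_info_soa& info, std::uint32_t key)
{
#if SKETCH_HAS_SSE2
	// Single aligned load of all four slots
	const __m128i lanes = _mm_load_si128(reinterpret_cast<const __m128i*>(info.addr));
	const __m128i eq = _mm_cmpeq_epi32(lanes, _mm_set1_epi32(static_cast<int>(key)));

	// One mask bit per 32-bit lane
	const int mask = _mm_movemask_ps(_mm_castsi128_ps(eq));
	return (mask & 1) + ((mask >> 1) & 1) + ((mask >> 2) & 1) + ((mask >> 3) & 1);
#else
	// Scalar fallback for targets without SSE2
	int count = 0;
	for (const std::uint32_t a : info.addr)
	{
		count += (a == key) ? 1 : 0;
	}
	return count;
#endif
}
```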

MsDarkLow · 2025-11-02T16:27:38Z

i9-13900K | RTX 3080
Ratchet & Clank Tools of Destruction - Improved a little bit. 88.3 -> 92.5
God of War 3 - Very small Regression. 62.4 -> 61.0
Sonic Unleashed - Tremendously reduced performance. From a stable 42.4 to very unstable 32.4

Ratchet & Clank

God of War 3

Sonic Unleashed

elad335 · 2025-11-06T15:58:12Z

I've pushed an experimental update, please test. If it works I'll put it under a special setting.

MsDarkLow · 2025-11-06T18:25:06Z

i9-13900K | RTX 3080
After these updates, all four games I've tested before have improved!
Ratchet & Clank Tools of Destruction - 88.3 -> 96.0 (earlier in the pr was at 92.0)
God of War 3 - 62.0 -> 64.0 (earlier in the pr it was 61.0)
Sonic Unleashed - 42.4 -> 43.4 (earlier in the pr it was 32.4)
Metal Gear Solid 4 - 95.4 -> 96.3

Ratchet & Clank

God of War 3

Sonic Unleashed

Metal Gear Solid 4

digant73 · 2025-11-06T18:30:26Z

KZ3 is still very bad (40 fps vs 69). GPU usage in particular is very low.

EDIT: attached also the log

RPCS3.zip

digant73 · 2025-11-06T19:10:41Z

Performance is also bad in the inFamous 2 demo (50 fps vs 80). The game also crashes with the following error:

PR:

MASTER:

Megamouse · 2025-11-06T19:20:31Z

rpcs3/Emu/Cell/Modules/cellCamera.cpp

-	std::lock_guard lock(g_camera.mutex);

-	*info = g_camera.info;
+	CellCameraInfoEx info_out;


Why is this change needed?
Zero comments

elad335 · 2025-11-06

We shouldn't touch VM memory under mutex, for a few reasons. (RSX access violations lengthen the duration of the lock, for example.)
We can put it in the coding guidelines.
There is no need to comment it each time.

kd-11 · 2025-11-07

Then a wrapper construct makes more sense, otherwise it will be repeated again elsewhere. Or maybe unlocked probe_for_read / probe_for_write makes more sense like usually done in real drivers.
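
One possible shape for such a wrapper, as a minimal sketch: `camera_info`, `camera_state`, `snapshot` and `get_info` are hypothetical stand-ins (the real type in the diff is `CellCameraInfoEx`). State is copied to a local while the mutex is held, and the write through the caller's pointer, which may fault when it targets guest memory, happens only after the lock is released.

```cpp
#include <mutex>

// Hypothetical stand-in for the real camera info structure.
struct camera_info
{
	int width;
	int height;
};

struct camera_state
{
	std::mutex mutex;
	camera_info info{};
};

// Copies the shared state while holding the mutex and returns it by value,
// so the lock is held only for the duration of the copy.
inline camera_info snapshot(camera_state& cam)
{
	std::lock_guard<std::mutex> lock(cam.mutex);
	return cam.info;
}

inline void get_info(camera_state& cam, camera_info* guest_out)
{
	const camera_info info_out = snapshot(cam);

	// Potentially faulting write, performed with no lock held
	*guest_out = info_out;
}
```

Routing every such access through a helper like `snapshot` keeps the rule in one place instead of repeating the local-copy pattern at each call site.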

elad335 added the CPU, Bugfix, and Optimization labels Oct 31, 2025

elad335 mentioned this pull request Oct 31, 2025

[Regression] Performance Regressions in Hot Shots Golf: Out of Bounds (#11904, #12523) #14724

Open

elad335 force-pushed the gamer branch from 45b23b1 to 4f2328f Compare October 31, 2025 08:52

elad335 requested a review from kd-11 October 31, 2025 10:02

elad335 force-pushed the gamer branch 3 times, most recently from a6a2fc0 to 9f8c31f Compare November 2, 2025 07:28

elad335 changed the title ~~vm/sys_memory: Remove VM locking in sys_memory_get_page_attribute~~ SPU: SPURS oriented thread waiting Nov 2, 2025

elad335 force-pushed the gamer branch 2 times, most recently from cafffb4 to 8d667a9 Compare November 2, 2025 09:09

kd-11 reviewed Nov 2, 2025

elad335 added 11 commits November 5, 2025 20:01

vm/sys_memory: Remove VM locking in sys_memory_get_page_attribute

50bb245

sys_rsx: Add fast path for sys_rsx_context_iomap

318909e

sys_rsx: Avoid modifying arguments

03b05fe

vm.cpp: Minor optimization

7e198ea

SPU: Exclude reservation_check address receptacle from writer_lock

516f4b1

Add wait flag in cellGemConvertVideoFinish

ea92b4a

SPU: Optimize check for args sharing page with Effective-Address

cb05ceb

SPU: Fix spu_thread::reservation_check(hash)

2ff3c7d

vm: Keep clearing to_clear

e14a1d3

cellGam: Add more wait flags

119c679

More logging to sys_memory_allocate_from_container

2cbf431

elad335 added 2 commits November 6, 2025 17:51

SPU/vm: Suspend PPUs AFTER SPU DMA locks clearing

8722501

Wait flag for cellCameraGetBufferInfoEx

65fd004

elad335 force-pushed the gamer branch from 8d667a9 to e3eb591 Compare November 6, 2025 15:54

elad335 added 2 commits November 6, 2025 17:56

cpu_flag::measure

e936f38

SPU: Experimental

618ffa3

elad335 force-pushed the gamer branch from e3eb591 to 618ffa3 Compare November 6, 2025 15:57

elad335 changed the title ~~SPU: SPURS oriented thread waiting~~ SPU: Update CELL Communication Performance Nov 6, 2025

Megamouse reviewed Nov 6, 2025

EdHerdman mentioned this pull request Nov 6, 2025

Carnivores HD [NPUB30869] texture loads fail, leading to purple objects ingame #14475

Closed


5 participants
